An Empirical Comparison of Outlier Detection Methods

Authors

  • Rohan Baxter
  • Hongxing He
  • Graham Williams
  • Simon Hawkins
  • Lifang Gu
Abstract

Four outlier detection methods are compared using both publicly available smaller statistical datasets and real-life Knowledge Discovery in Databases (KDD) datasets [1]. The smaller datasets provide insight (via visualisations) into the relative strengths and weaknesses of the compared methods. The real-life large datasets test scalability and practicality of application. We are unaware of previous comparisons of outlier detection methods for data mining applications. A methodology for comparing outlier detection methods is developed and we provide performance benchmarks against which new outlier detection methods

Similar resources

Rapid Distance-Based Outlier Detection via Sampling

Distance-based approaches to outlier detection are popular in data mining because they do not require modelling the underlying probability distribution, which is particularly challenging for high-dimensional data. We present an empirical comparison of various approaches to distance-based outlier detection across a large number of datasets. We report the surprising observation that a simple, sampling...

Comparison of Outlier Detection Methods in Biomedical Data

In this paper the use of outlier detection methods is discussed. This analysis is an introduction to the use of various methods of outlier detection in medical diagnoses (screening). The authors investigated the usefulness of selected outlier detection methods in the context of detection sensitivity, speed performance analysis and the difficulty of automating the performance analysis by using t...

RODHA: Robust Outlier Detection using Hybrid Approach

The task of outlier detection is to find the small groups of data objects that are exceptional relative to the inherent behavior of the rest of the data. Detection of such outliers is fundamental to a variety of database and analytic tasks such as fraud detection and customer migration. There are several approaches [10] to outlier detection employed in many study areas amongst which distance based and de...

Outlier Detection Using Extreme Learning Machines Based on Quantum Fuzzy C-Means

One of the most important concerns of a data miner is always to have accurate and error-free data: data that contains no human errors and whose records are complete and correct. In this paper, a new learning model based on an extreme learning machine neural network is proposed for outlier detection. The function of neural networks depends on various parameters such as the structur...

Detection of Outliers and Influential Observations in Linear Ridge Measurement Error Models with Stochastic Linear Restrictions

The aim of this paper is to propose some diagnostic methods in linear ridge measurement error models with stochastic linear restrictions using the corrected likelihood. Based on the bias-corrected estimation of model parameters, diagnostic measures are developed to identify outlying and influential observations. In addition, we derive the corrected score test statistic for outliers detection ba...

Publication date: 2001